Spoof Detection Using Source, Instantaneous Frequency and Cepstral Features

نویسندگان

Sarfaraz Jelil

Rohan Kumar Das

S. R. Mahadeva Prasanna

Rohit Sinha

چکیده

This work describes the techniques used for spoofed speech detection for the ASVspoof 2017 challenge. The main focus of this work is on exploiting the differences in the speech-specific nature of genuine speech signals and spoofed speech signals generated by replay attacks. This is achieved using glottal closure instants, epoch strength, and the peak to side lobe ratio of the Hilbert envelope of linear prediction residual. Apart from these source features, the instantaneous frequency cosine coefficient feature, and two cepstral features namely, constant Q cepstral coefficients and mel frequency cepstral coefficients are used. A combination of all these features is performed to obtain a high degree of accuracy for spoof detection. Initially, efficacy of these features are tested on the development set of the ASVspoof 2017 database with Gaussian mixture model based systems. The systems are then fused at score level which acts as the final combined system for the challenge. The combined system is able to outperform the individual systems by a significant margin. Finally, the experiments are repeated on the evaluation set of the database and the combined system results in an equal error rate of 13.95%.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Effectiveness of fundamental frequency (F0) and strength of excitation (SOE) for spoofed speech detection

Current countermeasures used in spoof detectors (for speech synthesis (SS) and voice conversion (VC)) are generally phase-based (as vocoders in SS and VC systems lack phaseinformation). These approaches may possibly fail for nonvocoder or unit-selection-based spoofs. In this work, we explore excitation source-based features, i.e., fundamental frequency (F0) contour and strength of excitation (S...

متن کامل

Unsupervised Representation Learning Using Convolutional Restricted Boltzmann Machine for Spoof Speech Detection

Speech Synthesis (SS) and Voice Conversion (VC) presents a genuine risk of attacks for Automatic Speaker Verification (ASV) technology. In this paper, we use our recently proposed unsupervised filterbank learning technique using Convolutional Restricted Boltzmann Machine (ConvRBM) as a frontend feature representation. ConvRBM is trained on training subset of ASV spoof 2015 challenge database. A...

متن کامل

Novel Subband Autoencoder Features for Detection of Spoofed Speech

Deep Neural Network (DNN) have been extensively used in Automatic Speech Recognition (ASR) applications. Very recently, DNNs have also found application in detecting natural vs. spoofed speech at ASV spoof challenge held at INTERSPEECH 2015. Along the similar lines, in this work, we propose a new feature extraction architecture of DNN called the subband autoencoder (SBAE) for spoof detection ta...

متن کامل

Combining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech

Speech synthesis and voice conversion techniques can pose threats to current speaker verification (SV) systems. For this purpose, it is essential to develop front end systems that are able to distinguish human speech vs. spoofed speech (synthesized or voice converted). In this paper, for the ASVspoof 2015 challenge, we propose a detector based on combination of cochlear filter cepstral coeffici...

متن کامل

Feature extraction from analytic phase of speech signals for speaker verification

The objective of this work is to study the speaker-specific nature of analytic phase of speech signals. Since computation of analytic phase suffers from phase wrapping problem, we have used its derivativethe instantaneous frequency for feature extraction. The cepstral coefficients extracted from smoothed subband instantaneous frequencies (IFCC) are used as features for speaker verification. The...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

Spoof Detection Using Source, Instantaneous Frequency and Cepstral Features

نویسندگان

چکیده

منابع مشابه

Effectiveness of fundamental frequency (F0) and strength of excitation (SOE) for spoofed speech detection

Unsupervised Representation Learning Using Convolutional Restricted Boltzmann Machine for Spoof Speech Detection

Novel Subband Autoencoder Features for Detection of Spoofed Speech

Combining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech

Feature extraction from analytic phase of speech signals for speaker verification

عنوان ژورنال:

اشتراک گذاری